AITopics | label shift

Collaborating Authors

label shift

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Conformal Bayes for Two-Sided Censored Gaussian Regression under Label Shift

Choi, Seungjin

arXiv.org Machine LearningJul-3-2026

Prediction under label shift becomes nonstandard when responses are censored. In a two-sided censored Gaussian model, latent values below $L$ and above $U$ are recorded at the boundary values, so the observed predictive distribution is mixed, with atoms at $L$ and $U$ and a continuous density on $(L,U)$. In this paper we develop conformal Bayes for this mixed-space setting by combining posterior predictive tilting with weighted conformal calibration. Under a two-sided Tobit Gaussian Bayesian prediction head with a Laplace posterior approximation, the tilted predictive distribution has left-atom, interior, and right-atom components, with a three-term closed-form normalizer. The resulting prediction set is a mixed highest density region that can combine boundary atoms with an interior interval and can reduce to atom-only sets under strong censoring. The main technical issue is that latent label shift does not directly give an ordinary density ratio on the observed censored scale. A latent exponential tilt induces tail-averaged atom weights at the censored boundaries, while the interior ratio remains density based. This yields a mixed observed-space calibration weight with two atom ratios and one interior density ratio. The weight corrects the calibration measure, while predictive tilting gives target-adapted mixed-HDR geometry. Synthetic experiments show that weighted tilted conformal Bayes restores marginal coverage with smaller sets than weighted source-score calibration, while revealing a trade-off between marginal coverage and component-wise behavior across atoms and interior observations.

artificial intelligence, machine learning, two-sided censored gaussian regression, (13 more...)

arXiv.org Machine Learning

2607.02173

Genre: Research Report > New Finding (0.34)

Industry: Law > Civil Rights & Constitutional Law (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Add feedback

Conformal Bayes under Label Shift: Post-Hoc Calibration vs. In-Training Adaptation

Choi, Seungjin

arXiv.org Machine LearningJun-29-2026

Conformal Bayes combines Bayesian posterior predictives with conformal calibration to produce prediction sets that are both statistically valid and geometrically efficient. We study conformal Bayes under label shift from a unified perspective, identifying two complementary approaches that restore nominal target-domain coverage through importance-weighted conformal calibration but operate through independent mechanisms. \emph{Post-hoc calibration} tilts the posterior predictive toward the target domain and corrects the conformal threshold via an importance-weighted quantile, leaving the parameter posterior unchanged. \emph{In-training adaptation} tilts the parameter posterior itself to the target domain, producing a corrected predictive whose highest predictive density region serves as the highest predictive density (HPD)-based prediction set under the fitted target predictive; efficiency is model-dependent and does not imply finite-sample conditional optimality. Two controlled experiments isolate the regime-dependence of each strategy: in the low-dimensional, well-estimated regime Strategy~A produces the narrowest valid intervals, while in the high-dimensional, underdetermined regime Strategy~B achieves up to $43\%$ width reduction at unchanged coverage, under the stated source-sampling and label-shift assumptions.

artificial intelligence, dtr, machine learning, (16 more...)

arXiv.org Machine Learning

2606.11865

Country: Asia > South Korea (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

Add feedback

Non-exchangeable Conformal Prediction with Optimal Transport: Tackling Distribution Shifts with Unlabeled Data

Neural Information Processing SystemsJun-23-2026, 00:51:47 GMT

Conformal prediction is a distribution-free uncertainty quantification method that has gained popularity in the machine learning community due to its finite-sample guarantees and ease of use. Its most common variant, dubbed split conformal prediction, is also computationally efficient as it boils down to collecting statistics of the model predictions on some calibration data not yet seen by the model. Nonetheless, these guarantees only hold if the calibration and test data are exchangeable, a condition that is difficult to verify and often violated in practice due to so-called distribution shifts. The literature is rife with methods to mitigate the loss in coverage in this non-exchangeable setting, but these methods require some prior information on the type of distribution shift to be expected at test time. In this work, we study this problem via a new perspective, through the lens of optimal transport, and show that it is possible to estimate the loss in coverage and mitigate arbitrary distribution shifts, offering a principled and broadly applicable solution.

artificial intelligence, machine learning, prediction, (19 more...)

Neural Information Processing Systems

Country: Europe > Netherlands (0.27)

Genre: Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.92)

Add feedback

Statistically Valid Post-Deployment Monitoring Should Be Standard for AI-Based Digital Health

Neural Information Processing SystemsJun-21-2026, 05:52:54 GMT

This position paper argues that post-deployment monitoring in clinical AI is underdeveloped and proposes statistically valid and label-efficient testing frameworks as a principled foundation for ensuring reliability and safety in real-world deployment. A recent review found that only 9% of FDA-registered AI-based healthcare tools include a post-deployment surveillance plan [1]. Existing monitoring approaches are often manual, sporadic, and reactive, making them ill-suited for the dynamic environments in which clinical models operate. We contend that post-deployment monitoring should be grounded in label-efficient and statistically valid testing frameworks, offering a principled alternative to current practices. We use the term "statistically valid" to refer to methods that provide explicit guarantees on error rates (e.g., Type I/II error), enable formal inference under pre-defined assumptions, and support reproducibility--features that align with regulatory requirements. Specifically, we propose that the detection of changes in the data and model performance degradation should be framed as distinct statistical hypothesis testing problems. Grounding monitoring in statistical rigor ensures a reproducible and scientifically sound basis for maintaining the reliability of clinical AI systems. Importantly, it also opens new research directions for the technical community--spanning theory, methods, and tools for statistically principled detection, attribution, and mitigation of post-deployment model failures in real-world settings.

artificial intelligence, data mining, machine learning, (16 more...)

Neural Information Processing Systems

Country: North America > United States (1.00)

Genre: Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Health Care Technology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (0.93)
Government > Regional Government > North America Government > United States Government > FDA (0.66)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
(2 more...)

Add feedback

IDOL: Meeting Diverse Distribution Shifts with Prior Physics for Tropical Cyclone Multi-Task Estimation

Neural Information Processing SystemsJun-20-2026, 17:36:29 GMT

Tropical Cyclone (TC) estimation aims to accurately estimate various TC attributes in real time.

distribution shift, machine learning, natural language, (20 more...)

Neural Information Processing Systems

Country:

North America > United States (0.46)
Asia (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Energy (0.94)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Data Science (0.93)
(2 more...)

Add feedback

Anytime-Valid Confirmation of Label-Shift Corrections

Choi, Seungjin

arXiv.org Machine LearningJun-15-2026

In small-batch scientific deployments, labeled target outcomes may be too scarce for reliable shift estimation even when unlabeled target inputs are available. We address the complementary setting where the practitioner has a pre-specified label-shift correction from domain knowledge and asks whether incoming labeled outcomes support it. We show that the per-observation likelihood ratio between a label-shift-corrected predictive and the source predictive is a conditional e-value, so its running product is a nonnegative martingale and Ville's inequality yields an anytime-valid confirmation rule. The log martingale equals the cumulative negative log-predictive density (NLPD) gap between the source and the corrected predictive, converting routine model monitoring into a formal sequential test. Rejection means the incoming data support the posited correction relative to the source predictive, but it is not a precise estimate of the degree of shift. Closed forms are available for GP sources with Gaussian label-shift ratios. GP regression simulations validate Type I control, finite-sample power, miscalibration sensitivity, and the small-batch advantage of a reliable prior over label-based re-estimation.

artificial intelligence, dtr, machine learning, (14 more...)

arXiv.org Machine Learning

2606.14028

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)

Add feedback

Aligning Evaluation with Clinical Priorities: Calibration, Label Shift, and Error Costs

Neural Information Processing SystemsJun-14-2026, 20:24:37 GMT

Machine learning-based decision support systems are increasingly deployed in clinical settings, where probabilistic scoring functions are used to inform and prioritize patient management decisions. However, widely used scoring rules, such as accuracy and AUC-ROC, fail to adequately reflect key clinical priorities, including calibration, robustness to distributional shifts, and sensitivity to asymmetric error costs. In this work, we propose a principled yet practical evaluation framework for selecting calibrated thresholded classifiers that explicitly accounts for uncertainty in class prevalences and domain-specific cost asymmetries. Building on the theory of proper scoring rules, particularly the Schervish representation, we derive an adjusted variant of cross-entropy (log score) that averages cost-weighted performance over clinically relevant ranges of class balance. The resulting evaluation is simple to apply, sensitive to clinical deployment conditions, and designed to prioritize models that are both calibrated and robust to real-world variations.

accuracy, decision support system, machine learning, (19 more...)

Neural Information Processing Systems

Country:

North America > United States > New York (0.46)
North America > United States > Massachusetts (0.28)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.67)

Industry:

Health & Medicine > Diagnostic Medicine (1.00)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.93)

Technology:

Information Technology > Decision Support Systems (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(3 more...)

Add feedback

AGeneralized Label Shift Perspective for Cross-Domain Gaze Estimation

Neural Information Processing SystemsJun-14-2026, 12:12:41 GMT

Aiming to generalize the well-trained gaze estimation model to new target domains, Cross-domain Gaze Estimation (CDGE) is developed for real-world application scenarios. Existing CDGE methods typically extract the domain-invariant features to mitigate domain shift in feature space, which is proved insufficient by Generalized Label Shift (GLS) theory. In this paper, we introduce a novel GLS perspective to CDGE and modelize the cross-domain problem by label and conditional shift problem. AGLS correction framework is presented and a feasible realization is proposed, in which an importance reweighting strategy based on truncated Gaussian distribution is introduced to overcome the continuity challenges in label shift correction. To embed the reweighted source distribution to conditional invariant learning, we further derive a probability-aware estimation of conditional operator discrepancy. Extensive experiments on standard CDGE tasks with different backbone models validate the superior generalization capability across domain and applicability on various models of proposed method.

artificial intelligence, estimation, machine learning, (17 more...)

Neural Information Processing Systems

Country: Asia > China (0.14)

Genre: Research Report > Experimental Study (1.00)

Industry: Education (0.48)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Add feedback

Minimum-Risk Recalibration of Classifiers

Neural Information Processing SystemsApr-29-2026, 23:57:49 GMT

Recalibrating probabilistic classifiers is vital for enhancing the reliability and accuracy of predictive models. Despite the development of numerous recalibration algorithms, there is still a lack of a comprehensive theory that integrates calibration and sharpness (which is essential for maintaining predictive power). In this paper, we introduce the concept of minimum-risk recalibration within the framework of mean-squared-error (MSE) decomposition, offering a principled approach for evaluating and recalibrating probabilistic classifiers. Using this framework, we analyze the uniform-mass binning (UMB) recalibration method and establish a finite-sample risk upper bound of order O(B/n+1/B2) where Bis the number of bins and nis the sample size. By balancing calibration and sharpness, we further determine that the optimal number of bins for UMB scales with n1/3, resulting in a risk bound of approximately O(n 2/3). Additionally, we tackle the challenge of label shift by proposing a two-stage approach that adjusts the recalibration function using limited labeled data from the target domain. Our results show that transferring a calibrated classifier requires significantly fewer target samples compared to recalibrating from scratch. We validate our theoretical findings through numerical simulations, which confirm the tightness of the proposed bounds, the optimal number of bins, and the effectiveness of label shift adaptation.

data mining, machine learning, natural language, (15 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (0.93)
Information Technology > Data Science > Data Mining (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Add feedback

Online Label Shift: Optimal Dynamic Regret meets Practical Algorithms

Neural Information Processing SystemsApr-29-2026, 20:11:50 GMT

This paper focuses on supervised and unsupervised online label shift, where the class marginals Q(y) varies but the class-conditionals Q(x|y) remain invariant. In the unsupervised setting, our goal is to adapt a learner, trained on some offline labeled data, to changing label distributions given unlabeled online data. In the supervised setting, we must both learn a classifier and adapt to the dynamically evolving class marginals given only labeled online data. We develop novel algorithms that reduce the adaptation problem to online regression and guarantee optimal dynamic regret without any prior knowledge of the extent of drift in the label distribution. Our solution is based on bootstrapping the estimates of online regression oracles that track the drifting proportions. Experiments across numerous simulated and real-world online label shift scenarios demonstrate the superior performance of our proposed approaches, often achieving 1-3% improvement in accuracy while being sample and computationally efficient. Code is publicly available at this url.

algorithm, artificial intelligence, machine learning, (15 more...)

Neural Information Processing Systems

Country: North America > United States (0.46)

Genre:

Workflow (0.46)
Instructional Material (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Data Science (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)
(2 more...)

Add feedback